chore(deps): Update datasets requirement from <4.0,>=3.0.0 to >=3.0.0,<5.0 #160

Merged
chichun-charlie-liu merged 1 commit into main from dependabot/pip/datasets-gte-3.0.0-and-lt-5.0
Jul 10, 2025

Conversation

@dependabot dependabot Bot (Contributor) commented on behalf of github on Jul 9, 2025

Updates the requirements on datasets to permit the latest version.
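
As a rough illustration of what the widened range permits, the sketch below checks a few version strings against the old bounds (`>=3.0.0,<4.0`) and the new bounds (`>=3.0.0,<5.0`). The helpers `parse` and `satisfies` are hypothetical, stdlib-only stand-ins written for this example. They are not how pip evaluates specifiers, and they ignore pre-releases and other PEP 440 details.

```python
# Simplified, assumption-laden check of the old and new version ranges.
# `parse` and `satisfies` are illustrative helpers, not part of pip or datasets.

def parse(version: str) -> tuple:
    """Turn a dotted version like '4.0' into a 3-item integer tuple, e.g. (4, 0, 0)."""
    parts = [int(part) for part in version.split(".")]
    # Pad with zeros so that '4.0' and '4.0.0' compare as equal.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def satisfies(version: str, lower: str, upper: str) -> bool:
    """Return True if lower <= version < upper (inclusive lower, exclusive upper)."""
    return parse(lower) <= parse(version) < parse(upper)


OLD_RANGE = ("3.0.0", "4.0")  # >=3.0.0,<4.0
NEW_RANGE = ("3.0.0", "5.0")  # >=3.0.0,<5.0

for candidate in ("3.0.0", "4.0.0", "5.0.0"):
    print(
        candidate,
        "old:", satisfies(candidate, *OLD_RANGE),
        "new:", satisfies(candidate, *NEW_RANGE),
    )
```

Under these simplified rules, 4.0.0 falls outside the old range and inside the new one, while 5.0.0 is excluded by both.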

Release notes

Sourced from datasets's releases.

4.0.0

New Features

  • Add IterableDataset.push_to_hub() by @​lhoestq in huggingface/datasets#7595

    # Build streaming data pipelines in a few lines of code !
    from datasets import load_dataset
    ds = load_dataset(..., streaming=True)
    ds = ds.map(...).filter(...)
    ds.push_to_hub(...)

  • Add num_proc= to .push_to_hub() (Dataset and IterableDataset) by @​lhoestq in huggingface/datasets#7606

    # Faster push to Hub ! Available for both Dataset and IterableDataset
    ds.push_to_hub(..., num_proc=8)

  • New Column object

    # Syntax:
    ds["column_name"]  # datasets.Column([...]) or datasets.IterableColumn(...)

    # Iterate on a column:
    for text in ds["text"]:
        ...

    # Load one cell without bringing the full column in memory
    first_text = ds["text"][0]  # equivalent to ds[0]["text"]

  • Torchcodec decoding by @​TyTodd in huggingface/datasets#7616

    • Enables streaming only the ranges you need !

    # Don't download full audios/videos when it's not necessary
    # Now with torchcodec it only streams the required ranges/frames:
    from datasets import load_dataset
    ds = load_dataset(..., streaming=True)
    for example in ds:
        video = example["video"]
        frames = video.get_frames_in_range(start=0, stop=6, step=1)  # only stream certain frames

    • Requires torch>=2.7.0 and FFmpeg >= 4

... (truncated)

Commits

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Updates the requirements on [datasets](https://github.com/huggingface/datasets) to permit the latest version.
- [Release notes](https://github.com/huggingface/datasets/releases)
- [Commits](huggingface/datasets@3.0.0...4.0.0)

---
updated-dependencies:
- dependency-name: datasets
  dependency-version: 4.0.0
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
@dependabot dependabot Bot added the dependencies and python labels Jul 9, 2025
@dependabot dependabot Bot requested a review from tharapalanivel as a code owner July 9, 2025 21:13
@github-actions github-actions Bot added the chore label and removed the dependencies label Jul 9, 2025
@chichun-charlie-liu chichun-charlie-liu merged commit 67a5e55 into main Jul 10, 2025
12 checks passed
@dependabot dependabot Bot deleted the dependabot/pip/datasets-gte-3.0.0-and-lt-5.0 branch July 10, 2025 17:31
Labels

chore python Pull requests that update python code
